    Investigating the Effects of Word Substitution Errors on Sentence Embeddings

    A key initial step in several natural language processing (NLP) tasks involves embedding phrases of text to vectors of real numbers that preserve semantic meaning. To that end, several methods have been recently proposed with impressive results on semantic similarity tasks. However, all of these approaches assume that perfect transcripts are available when generating the embeddings. While this is a reasonable assumption for analysis of written text, it is limiting for analysis of transcribed text. In this paper we investigate the effects of word substitution errors, such as those coming from automatic speech recognition (ASR) errors, on several state-of-the-art sentence embedding methods. To do this, we propose a new simulator that allows the experimenter to induce ASR-plausible word substitution errors in a corpus at a desired word error rate. We use this simulator to evaluate the robustness of several sentence embedding methods. Our results show that pre-trained neural sentence encoders are both robust to ASR errors and perform well on textual similarity tasks after errors are introduced. Meanwhile, unweighted averages of word vectors perform well with perfect transcriptions, but their performance degrades rapidly on textual similarity tasks for text with word substitution errors. Comment: 4 pages, 2 figures. Copyright IEEE 2019. Accepted and to appear in the Proceedings of the 44th International Conference on Acoustics, Speech, and Signal Processing 2019 (IEEE-ICASSP-2019), May 12-17 in Brighton, U.K. Personal use of this material is permitted. However, permission to reprint/republish this material must be obtained from the IEE
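
    The idea of inducing word substitution errors at a desired word error rate can be illustrated with a minimal sketch. This is not the simulator from the paper: the confusion table, the function names, and the uniform random choice of positions are all assumptions made for illustration, whereas the abstract only states that the substitutions are ASR-plausible.

```python
import random

def substitute_words(tokens, wer, confusions, seed=0):
    """Replace about wer * len(tokens) words with a listed confusion.

    `confusions` is a hypothetical lookup table (word -> plausible
    mis-recognitions); it stands in for whatever model of ASR
    plausibility a real simulator would use.
    """
    rng = random.Random(seed)
    # Only words with a known confusion can be corrupted.
    candidates = [i for i, t in enumerate(tokens) if t in confusions]
    n_errors = min(round(wer * len(tokens)), len(candidates))
    out = list(tokens)
    for i in rng.sample(candidates, n_errors):
        out[i] = rng.choice(confusions[tokens[i]])
    return out

def word_error_rate(ref, hyp):
    """Substitution-only WER for two sequences of equal length."""
    if len(ref) != len(hyp):
        raise ValueError("substitution-only WER needs equal lengths")
    return sum(r != h for r, h in zip(ref, hyp)) / len(ref)

# Hypothetical confusion table; no entry maps a word to itself.
confusions = {
    "their": ["there"],
    "weather": ["whether"],
    "two": ["too", "to"],
    "see": ["sea"],
}
tokens = "we will see their two reports on the weather".split()
noisy = substitute_words(tokens, wer=0.33, confusions=confusions)
achieved = word_error_rate(tokens, noisy)
```

    The achieved rate can fall short of the requested one when too few words have an entry in the table, which is why the sketch measures the rate after corruption instead of assuming it.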

    Simulating dysarthric speech for training data augmentation in clinical speech applications

    Training machine learning algorithms for speech applications requires large, labeled training data sets. This is problematic for clinical applications where obtaining such data is prohibitively expensive because of privacy concerns or lack of access. As a result, clinical speech applications are typically developed using small data sets with only tens of speakers. In this paper, we propose a method for simulating training data for clinical applications by transforming healthy speech to dysarthric speech using adversarial training. We evaluate the efficacy of our approach using both objective and subjective criteria. We present the transformed samples to five experienced speech-language pathologists (SLPs) and ask them to identify the samples as healthy or dysarthric. The results reveal that the SLPs identify the transformed speech as dysarthric 65% of the time. In a pilot classification experiment, we show that by using the simulated speech samples to balance an existing dataset, the classification accuracy improves by about 10% after data augmentation. Comment: Will appear in Proc. of ICASSP 201

    The Ecology Of Drift Algae In The Indian River Lagoon, Florida

    To gain an understanding of the ecology of drift algae in the Indian River Lagoon system along the east coast of central Florida, four questions were addressed: 1) What is the composition and rate of accumulation of drift? 2) How much movement and turnover occurs within drift accumulations? 3) Do growth rates differ for drift versus attached algae? 4) Is there a difference in photosynthetic performance in drift versus attached algal species? Manipulative field and laboratory experiments were conducted to address these questions with the green macroalga Codium decorticatum and the red macroalga Gracilaria tikvahiae. Changes in pigment concentration and biomass were used as indicators of acclimation from an attached to drift state in Gracilaria tikvahiae and Codium decorticatum. Short-term physiological changes as demonstrated by electron transport rate (ETR) were also used as indications of acclimation from an attached to drift state in C. decorticatum. Composition and rate of accumulation of drift varied by season. While both transport and turnover of drift occurred, turnover within drift accumulations occurred at low rates and was significantly lower in the spring during decreased flow rates. There were no significant differences in growth or pigment concentrations in drift versus attached G. tikvahiae or C. decorticatum. In addition, there were no apparent physiological acclimations to a drift state in C. decorticatum

    A Cognitive-Perceptual Approach to Conceptualizing Speech Intelligibility Deficits and Remediation Practice in Hypokinetic Dysarthria

    Hypokinetic dysarthria is a common manifestation of Parkinson's disease, which negatively influences quality of life. Behavioral techniques that aim to improve speech intelligibility constitute the bulk of intervention strategies for this population, as the dysarthria does not often respond vigorously to medical interventions. Although several case and group studies generally support the efficacy of behavioral treatment, much work remains to establish a rigorous evidence base. This absence of definitive research leaves both the speech-language pathologist and referring physician with the task of determining the feasibility and nature of therapy for intelligibility remediation in PD. The purpose of this paper is to introduce a novel framework for medical practitioners in which to conceptualize and justify potential targets for speech remediation. The most commonly targeted deficits (e.g., speaking rate and vocal loudness) can be supported by this approach, as well as underutilized and novel treatment targets that aim at the listener's perceptual skills

    ALS Longitudinal Studies With Frequent Data Collection at Home: Study Design and Baseline Data

    Objective: To design an ALS clinical study in which patients are remotely recruited, screened, enrolled and then assessed via daily data collection at home by themselves or caregivers. Methods: This observational, natural-history study included two academic medical centers, one providing overall clinical management and the other overseeing computing and web-services design and management. Both healthy and ALS subjects were recruited on the Internet via advertisement on governmental and foundation websites as well as through Facebook and Google paid advertisements. Individuals underwent screening and enrollment remotely, including signing an electronic informed consent form. Participants were then provided self-measurement equipment and instructed on its use through a series of web-based videos. The equipment included a handgrip dynamometer, spirometer with smartphone connection, electrical impedance myography device, and an activity tracker. ALS Functional Rating Scale-Revised data were also collected. Subjects were asked to collect data daily for three months and twice weekly for the subsequent six months. Results: One hundred and eleven ALS patients and 30 healthy individuals enrolled in the study from across 41 states (74 men, 62 women). Baseline median ALSFRS-R score was 33. Seventy-two percent of the ALS patients who were sent equipment and 88% of the healthy subjects who were sent equipment were able to complete a first set of measurements. Expected baseline differences between the ALS patients and healthy participants were identified for all measures. Conclusions: It is possible to design and institute an at-home study in ALS patients, using a number of state-of-the-art approaches, including web-based consenting and training and Internet-connected measurement devices